---
title: "ggplot_tickstricks"
output: html_notebook
editor_options: 
  markdown: 
    wrap: 72
---

### clear environment

```{r}
rm(list = ls())
```

### load packages (install if you don't have them)

```{r}
library(data.table) #package for manipulating and computing on data
library(ggplot2) #plotting (also attached as a dependency of raincloudplots)
library(colourpicker) #addin useful for selecting colours
library(cowplot) #theme for plotting
library(raincloudplots) #https://wellcomeopenresearch.org/articles/4-63
```

### packages that have data sets in them

```{r}
library(Lahman) #has data sets related to baseball (AllstarFull and Pitching)
library(palmerpenguins) #has data sets related to penguins (penguins)

```

### load in data

AllstarFull from the Lahman package has baseball stats.

Pitching is another data set included with the Lahman package and is
more comprehensive.

The palmerpenguins package has data sets related to penguins.

ggplot2 also includes the midwest data set. Load it with
`data("midwest", package = "ggplot2")`

Copy the data and then convert the copy to a data.table. `setDT()`
converts by reference, so working on a copy leaves the original Pitching
data frame as it was.

```{r}
dat = copy(Pitching)
class(dat)
setDT(dat)
class(dat)
```
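
As a minimal sketch of why the copy matters: `setDT()` changes its
argument in place, so converting a copy leaves the original data frame
alone. The toy data frame and the names `toy_df` and `toy_dt` are made
up for illustration and are not part of the Lahman data.

```{r}
library(data.table)

# toy data frame (made-up values, not from Lahman)
toy_df <- data.frame(playerID = c("a", "b"), W = c(10, 12))

toy_dt <- copy(toy_df) # deep copy, so toy_df is left alone
setDT(toy_dt)          # converts toy_dt to a data.table in place

class(toy_df) # still a plain data.frame
class(toy_dt) # now a data.table (and a data.frame)
```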

examine data

```{r}
head(dat)
str(dat)
```

### **notes on data**

**playerID:** Player ID code

**yearID:** Year

**stint:** player's stint (order of appearances within a season)

**teamID:** Team (factor)

**lgID:** League ID, a factor with levels AA, AL, FL, NL, PL, UA

**W:** Wins

**L:** Losses

**G:** Games

**GS:** Games Started

**CG:** Complete Games

**SHO:** Shutouts

**SV:** Saves

**IPouts:** Outs Pitched (innings pitched x 3)

**H:** Hits

**ER:** Earned Runs

**HR:** Home runs

**BB:** Walks

**SO:** Strikeouts

**BAOpp:** Opponent's Batting Average

**ERA:** Earned Run Average

**IBB:** Intentional Walks

**WP:** Wild Pitches

**HBP:** Batters Hit By Pitch

**BK:** Balks

**BFP:** Batters faced by Pitcher

**GF:** Games Finished

**R:** Runs Allowed

**SH:** Sacrifices by opposing batters

**SF:** Sacrifice flies by opposing batters

**GIDP:** Grounded into double plays by opposing batter
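
Two of the entries above are linked by simple arithmetic: innings
pitched is IPouts / 3, and ERA is conventionally earned runs per nine
innings. A quick sketch with made-up numbers (not taken from the data):

```{r}
# made-up season totals for one pitcher
IPouts <- 600 # outs pitched
ER <- 50      # earned runs

innings <- IPouts / 3   # innings pitched
era <- 9 * ER / innings # earned runs per nine innings

innings # 200
era     # 2.25
```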

### **Core elements of a ggplot plot:**

(compiled from: <https://ourcodingclub.github.io/tutorials/datavis/>)

**geom**

Geometric object which defines the type of graph you are making.

It reads your data in the aesthetics mapping to know which variables to
use, and creates the graph accordingly.

Some common types are:

-   geom_point()

-   geom_boxplot()

-   geom_histogram()

-   geom_col()

**aes**

Short for aesthetics.

Usually placed within a geom\_, this is where you specify your data
source and variables, AND the properties of the graph which depend on
those variables.

For instance, if you want all data points to be the same colour, you
would define the 'colour =' argument *outside* the aes() function; if
you want the data points to be coloured by a factor's levels (e.g. by
site or species), you specify the colour = argument *inside* the aes().

Some common things to include in aes are:

-   x

-   y

-   fill

-   colour

-   size

-   shape

**But** note that different geoms have different aesthetics available
(see cheatsheet below for example)
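
A small sketch of the inside/outside distinction, using the built-in
iris data as a stand-in (the object names `fixed_colour` and
`mapped_colour` are just for illustration):

```{r}
library(ggplot2)

# colour set outside aes(): every point gets the same fixed colour
fixed_colour <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(colour = "steelblue")

# colour set inside aes(): colour is mapped to the levels of Species
mapped_colour <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(aes(colour = Species))

fixed_colour
mapped_colour
```

The second plot also gets a legend automatically, because the colour now
carries information about the data.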

**stat**

A stat layer applies some statistical transformation to the underlying
data: for instance, stat_smooth(method = 'lm') displays a linear
regression line and confidence interval ribbon on top of a scatter plot
(defined with geom_point()).
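
For instance, a minimal sketch of that stat_smooth() example on the
built-in mtcars data (a stand-in data set; `stat_demo` is an
illustrative name):

```{r}
library(ggplot2)

# scatter plot with a linear regression line and confidence ribbon on top
stat_demo <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x)

stat_demo
```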

**theme**

A set of visual parameters that control the background, borders, grid
lines, axes, text size, legend position, etc.

You can use pre-defined themes (e.g., theme_cowplot() from the cowplot
package), create your own, or use a predefined theme and overwrite only
the elements you don't like.

Examples of elements within themes are:

-   **axis.text**

e.g., axis.text.y = element_text(size = 12)

e.g., axis.text.x = element_text(size = 12, angle = 45, vjust = 1, hjust
= 1)

[makes the x labels at an angle]

-   **axis.title**

e.g., axis.title = element_text(size = 14, face = "plain")

-   **panel.grid**

e.g., panel.grid = element_blank()

[Removes the background grid lines]

-   **plot.margin**

e.g., plot.margin = unit(c(1,1,1,1), units = "cm")

[Adds a 1cm margin around the plot]

-   **legend.text**

e.g., legend.text = element_text(size = 12, face = "italic")

[Setting the font for the legend text]

-   **legend.title**

e.g., legend.title = element_blank()

[Remove the legend title - useful as sometimes this is excessive and the
default is to include it]

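-   **legend.position**

e.g., legend.position = c(0.9, 0.9)

[Places the legend inside the plot area, near the top-right corner. From
ggplot2 3.5.0 on, the preferred spelling is legend.position = "inside"
together with legend.position.inside = c(0.9, 0.9)]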

-   **putting it all together...**

+theme(axis.text.x = element_text(size = 12, angle = 45, vjust = 1,
hjust = 1),

axis.text.y = element_text(size = 12),

axis.title = element_text(size = 14, face = "plain"),

panel.grid = element_blank(),

plot.margin = unit(c(1,1,1,1), units = "cm"),

legend.text = element_text(size = 12, face = "italic"),

legend.title = element_blank(),

legend.position = c(0.9, 0.9))

You define their properties with element\_...() functions. For example:

element_blank() draws nothing, so it removes the element it is assigned
to (ideal for removing background colour),

element_text(size = ..., face = ..., angle = ...) lets you control all
kinds of text properties.

For example, theme(axis.text.x = element_text(size = 12, angle = 45,
vjust = 1, hjust = 1)) puts the x axis labels (here, the years) at a bit
of an angle.
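
Since the same theme() call tends to get repeated from plot to plot, one
option is to store it in a variable and add it where needed. This is
only a sketch: `my_theme` and `theme_demo` are illustrative names,
mtcars is a stand-in data set, and the legend position is left at its
default here.

```{r}
library(ggplot2)

# store the theme tweaks once
my_theme <- theme(
  axis.text.x = element_text(size = 12, angle = 45, vjust = 1, hjust = 1),
  axis.text.y = element_text(size = 12),
  axis.title = element_text(size = 14, face = "plain"),
  panel.grid = element_blank(),
  plot.margin = unit(c(1, 1, 1, 1), units = "cm"),
  legend.text = element_text(size = 12, face = "italic"),
  legend.title = element_blank()
)

# add them on top of a pre-defined theme
theme_demo <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  theme_classic() +
  my_theme

theme_demo
```

Order matters: my_theme is added after theme_classic(), so its settings
overwrite the ones from the pre-defined theme.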

```{r}
#try a plot of home runs (HR) over year
ggplot(dat, aes(x=yearID, y=HR))+geom_point()
#equivalent to
ggplot(dat)+geom_point(aes(x=yearID, y=HR))
```

Top tip: wrapping the assignment in parentheses () assigns the plot to a
variable and prints it at the same time. This is useful if you want to
save the plot, turn it into a figure, or refer to it later (e.g.,
replot it, or put it in a panel with other figures). The example here
uses the same plot as above

```{r}
(plot1 = ggplot(dat)+geom_point(aes(x=yearID, y=HR)))
```
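
The parentheses trick is general R behaviour rather than anything
specific to ggplot. A small base R sketch (the variable `total` is made
up for illustration):

```{r}
# plain assignment: the value is stored but not printed
total <- sum(1:10)

# assignment wrapped in parentheses: stored AND printed
(total <- sum(1:10))

# withVisible() reports whether a value would be auto-printed
withVisible(total <- sum(1:10))$visible   # FALSE
withVisible((total <- sum(1:10)))$visible # TRUE
```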

remove grey background with +theme_bw()

```{r}
(plot1 = ggplot(dat)+geom_point(aes(x=yearID, y=HR)) + theme_bw())
```

many other themes are available

```{r}
(plot1 = ggplot(dat)+geom_point(aes(x=yearID, y=HR)) + theme_classic())

(plot1 = ggplot(dat)+geom_point(aes(x=yearID, y=HR)) + theme_minimal())

(plot1 = ggplot(dat)+geom_point(aes(x=yearID, y=HR)) + theme_cowplot())
```

you can also create your own theme!

Just write it as a function. Example here taken from:
<https://rpubs.com/jenrichmond/W6LL>

```{r}
#library(data.table)
#library(palmerpenguins)
#library(cowplot)
#library(ggplot2)

theme_jen <- function () {
  
  # define font up front
  font <- "Helvetica"  
  # this theme uses theme_bw as the base 
  
  theme_bw() %+replace%   
    theme(
      #get rid of grid lines/borders
      panel.border = element_blank(), 
      panel.grid.major = element_blank(), 
      panel.grid.minor = element_blank(), 
      # add white space top, right, bottom, left
      plot.margin = unit(c(1, 1, 1, 1), "cm"), 
      # custom axis title/text/lines
      axis.title = element_text(            
        family = font,                     
        size = 14),               
      axis.text = element_text(              
        family = font,                       
        size = 12),   
      # margin pulls text away from axis
      axis.text.x = element_text(           
        margin=margin(5, b = 10)),
      # black lines
      axis.line = element_line(colour = "black", size = rel(1)), 
      # custom plot titles, subtitles, captions
      plot.title = element_text(             
        family = font,              
        size = 18,
        hjust = -0.1,
        vjust = 4),
       # custom plot subtitles
      plot.subtitle = element_text(          
        family = font,                   
        size = 14, 
        hjust = 0,
        vjust = 3),
       # custom captions
      plot.caption = element_text(           
        family = font,                   
        size = 10,
        hjust = 1,
        vjust = 2), 
      # custom legend 
      legend.title = element_text(          
        family = font,           
        size = 10,                
        hjust = 0), 
      legend.text = element_text(          
        family = font,               
        size = 8,                     
        hjust = 0), 
      #no background on legend
      legend.key = element_blank(),   
      # white background on plot
      strip.background = element_rect(fill = "white",  
                                      colour = "black", 
                                      size = rel(2)), complete = TRUE)
  
}
```

```{r}
#source("theme_jen.R") # the script/function containing custom ggplot theme
(plot1 = ggplot(dat)+geom_point(aes(x=yearID, y=HR)) + theme_jen())
```

add labels to the x and y axes, plus various elements of the theme

```{r}
(plot1 = ggplot(dat)+geom_point(aes(x=yearID, y=HR)) +
    theme_classic()+
    xlab('\nyear')+ #\n adds a blank line above the label
    ylab('number of home runs')+
    theme(axis.text.x = element_text(size = 12, angle = 45, vjust = 1, hjust = 1), # put the years at a bit of an angle
          axis.text.y = element_text(size = 12),
          axis.title = element_text(size = 14, face = "plain"),
          panel.grid = element_blank(), # remove the background grid lines
          plot.margin = unit(c(1,1,1,1), units = "cm"), # add a 1cm margin around the plot
          legend.text = element_text(size = 12, face = "italic"), # set the font for the legend text
          legend.title = element_blank(), # remove the legend title
          legend.position = c(0.9, 0.9)))
```

it might be cleaner to do the same plot on the mean HR per year

```{r}
(plot1 = ggplot(dat[, .(HR=mean(HR)), by=yearID])+geom_point(aes(x=yearID, y=HR)) + theme_classic()+
    xlab('\nyear')+
    ylab('mean home runs per year')+
    theme(axis.text.x = element_text(size = 12, angle = 45, vjust = 1, hjust = 1),
          axis.text.y = element_text(size = 12),
          axis.title = element_text(size = 14, face = "plain"),
          panel.grid = element_blank(),
          plot.margin = unit(c(1,1,1,1), units = "cm"),
          legend.text = element_text(size = 12, face = "italic"),
          legend.title = element_blank(),
          legend.position = c(0.9, 0.9)))
```
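
The aggregation inside ggplot() above uses the data.table form
`DT[i, j, by]`. A minimal sketch on a toy table (made-up values; the
names `toy` and `toy_means` are illustrative):

```{r}
library(data.table)

# toy data: three pitcher rows across two years (made-up values)
toy <- data.table(yearID = c(2000, 2000, 2001), HR = c(10, 20, 30))

# DT[i, j, by]: i is left empty (no row filter),
# j computes the mean, by says which groups to compute it within
toy_means <- toy[, .(HR = mean(HR)), by = yearID]
toy_means
```

The result has one row per year (15 for 2000 and 30 for 2001), which is
the shape ggplot needs to draw one point per year.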

add a linear trendline using geom_smooth(). You have to specify the
method for this (method="lm" or method=lm is fine). The standard error
ribbon is added by default (add se=FALSE to disable this)

```{r}
(plot1 = ggplot(dat[, .(HR=mean(HR)), by=yearID], aes(x=yearID, y=HR))+
    geom_point()+
    geom_smooth(method=lm)+
    theme_classic()+
    xlab('\nyear')+
    ylab('mean home runs per year')+
    theme(axis.text.x = element_text(size = 12, angle = 45, vjust = 1, hjust = 1),
          axis.text.y = element_text(size = 12),
          axis.title = element_text(size = 14, face = "plain"),
          panel.grid = element_blank(),
          plot.margin = unit(c(1,1,1,1), units = "cm"),
          legend.text = element_text(size = 12, face = "italic"),
          legend.title = element_blank(),
          legend.position = c(0.9, 0.9)))
```

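you can also add a specific formula in geom_smooth(). For a polynomial,
wrap the powers in I() (e.g., y \~ x + I(x\^2) + I(x\^3)) or use poly(x,
3), because a bare x\^2 in an R formula is not read as a squared term.
Set method=lm as well, otherwise geom_smooth() picks a smoother (such as
loess) automatically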

```{r}
(plot1 = ggplot(dat[, .(HR=mean(HR)), by=yearID], aes(x=yearID, y=HR))+
    geom_point()+
    geom_smooth(method=lm, formula=y~poly(x, 3))+ # cubic polynomial fit
    theme_classic()+
    xlab('\nyear')+
    ylab('mean home runs per year')+
    theme(axis.text.x = element_text(size = 12, angle = 45, vjust = 1, hjust = 1),
          axis.text.y = element_text(size = 12),
          axis.title = element_text(size = 14, face = "plain"),
          panel.grid = element_blank(),
          plot.margin = unit(c(1,1,1,1), units = "cm"),
          legend.text = element_text(size = 12, face = "italic"),
          legend.title = element_blank(),
          legend.position = c(0.9, 0.9)))
```
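
A note on polynomial formulas: in an R model formula, `^` is a formula
operator, so a bare `x^2` does not add a squared term. A small base R
check with made-up data, as a sketch (all object names here are
illustrative):

```{r}
# made-up data following a cubic trend
set.seed(1)
x <- seq(-2, 2, length.out = 50)
y <- x^3 - x + rnorm(50, sd = 0.2)

fit_bare <- lm(y ~ x + x^2 + x^3)        # bare ^ is a formula operator here
fit_cubic <- lm(y ~ x + I(x^2) + I(x^3)) # I() protects the arithmetic
fit_poly <- lm(y ~ poly(x, 3))           # orthogonal polynomial, same fitted curve

length(coef(fit_bare))  # 2: intercept and x only
length(coef(fit_cubic)) # 4: intercept, x, x^2, x^3
all.equal(fitted(fit_cubic), fitted(fit_poly)) # TRUE
```

poly(x, 3) tends to be the safer choice when x is large (such as a year),
because raw powers of large numbers can be numerically awkward.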

facet_wrap() can be used to easily plot data in panels (e.g., plot mean
home runs over time for each lgID; here I also distinguish leagues by
colour). Setting scales = "free_y" below means the y axis can vary from
panel to panel. You can also use `nrow =` or `ncol =` to specify the
number of rows/columns

```{r}
dat[, yearIDfact := as.factor(yearID)] # year as a factor, added by reference

(plot1 = ggplot(dat[, .(HR=mean(HR)), by=.(lgID, yearID)], aes(x=yearID, y=HR, colour=lgID))+
    geom_point()+
    facet_wrap(vars(lgID), scales = "free_y")+
    theme_classic()+
    xlab('\nyear')+            #\n adds blank line
    ylab('mean home runs per year'))
```
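
To illustrate the `ncol =` option mentioned above, here is a small
sketch on the built-in mtcars data (a stand-in data set; `facet_demo` is
an illustrative name):

```{r}
library(ggplot2)

# one panel per number of cylinders, stacked in a single column,
# each panel with its own y axis range
facet_demo <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(vars(cyl), ncol = 1, scales = "free_y")

facet_demo
```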

facet_grid() does a similar thing but organises the panels into rows or
columns

here use rows based on lgID

```{r}
(plot1 = ggplot(dat[, .(HR=mean(HR)), by=.(lgID, yearID)], aes(x=yearID, y=HR, colour=lgID))+
    geom_point()+
    facet_grid(lgID ~ .)+
    theme_classic()+
    xlab('\nyear')+            #\n adds blank line
    ylab('mean home runs per year'))
```

columns based on lgID

```{r}
(plot1 = ggplot(dat[, .(HR=mean(HR)), by=.(lgID, yearID)], aes(x=yearID, y=HR, colour=lgID))+
    geom_point()+
    facet_grid(. ~ lgID)+
    theme_classic()+
    xlab('\nyear')+            #\n adds blank line
    ylab('mean home runs per year'))
```

### **bar plots**

### **bar plots with error bars and individual data points**

A special subcategory as this is the most common plot I end up having to
do.
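
One possible way to build this kind of plot, as a rough sketch rather
than a fixed recipe: the built-in iris data stands in for real data,
`bar_demo` is an illustrative name, and the error bars are assumed to be
mean +/- 1 standard error (via ggplot2's mean_se()).

```{r}
library(ggplot2)

bar_demo <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  # bar height = group mean
  stat_summary(fun = mean, geom = "col", fill = "grey80", colour = "black") +
  # error bars = mean +/- 1 standard error
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.2) +
  # individual data points, jittered sideways so they do not overlap
  geom_jitter(width = 0.1, height = 0, alpha = 0.5) +
  theme_classic()

bar_demo
```

Letting stat_summary() compute the means and errors keeps the raw data
in the plot, so the individual points can be drawn from the same data
frame.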

**Note on data wrangling**

### **box plots**

### **exercises**

### **Resources/Links**

Non exhaustive list of links/resources I've used in the course of
compiling this notebook

<https://rpubs.com/jenrichmond/W6LL>

<https://rafalab.github.io/dsbook/ggplot2.html>

<http://r-statistics.co/Complete-Ggplot2-Tutorial-Part1-With-R-Code.html>

<https://ourcodingclub.github.io/tutorials/datavis/>

<https://ourcodingclub.github.io/tutorials/data-vis-2/>

<https://ourcodingclub.github.io/tutorials/qualitative/>

### **ggplot cheatsheet**

![ggplot cheatsheet](images/ggplot2-cheatsheeta.png) ![ggplot
cheatsheet](images/ggplot2-cheatsheetb.png)
